Context-aware Neural Machine Translation for English-Japanese Business Scene Dialogues
Despite the remarkable advancements in machine translation, the current
sentence-level paradigm faces challenges when dealing with highly-contextual
languages like Japanese. In this paper, we explore how context-awareness can
improve the performance of the current Neural Machine Translation (NMT) models
for English-Japanese business dialogue translation, and what kind of context
provides meaningful information to improve translation. As business dialogue
involves complex discourse phenomena but offers scarce training resources, we
adapted a pretrained mBART model, fine-tuning it on multi-sentence dialogue data,
which allows us to experiment with different contexts. We investigate the
impact of larger context sizes and propose novel context tokens encoding
extra-sentential information, such as speaker turn and scene type. We make use
of Conditional Cross-Mutual Information (CXMI) to explore how much of the
context the model uses and generalise CXMI to study the impact of the
extra-sentential context. Overall, we find that models leverage both preceding
sentences and extra-sentential context (with CXMI increasing with context size)
and we provide a more focused analysis on honorifics translation. Regarding
translation quality, increased source-side context paired with scene and
speaker information improves the model performance compared to previous work
and our context-agnostic baselines, as measured by BLEU and COMET.
Comment: MT Summit 2023, research track, link to paper in proceedings: https://aclanthology.org/2023.mtsummit-research.23
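The CXMI measure used above can be estimated from the per-sentence log-probabilities that a context-agnostic and a context-aware model assign to the same reference translations. A minimal sketch, assuming log-probabilities are already computed (the function name and interface are illustrative, not the paper's implementation):

```python
def cxmi(logprobs_context_agnostic, logprobs_context_aware):
    """Estimate Conditional Cross-Mutual Information (CXMI) as the mean
    gain in reference log-likelihood when context is provided, i.e.
    CXMI ~ H(Y|X) - H(Y|X, C).  Positive values suggest the model
    actually uses the extra context."""
    assert len(logprobs_context_agnostic) == len(logprobs_context_aware)
    gains = [with_c - without_c
             for without_c, with_c in zip(logprobs_context_agnostic,
                                          logprobs_context_aware)]
    return sum(gains) / len(gains)
```

Restricting the inputs to the log-probabilities of specific target tokens (e.g. honorific markers) gives the kind of focused analysis the abstract mentions.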
BLEU Meets COMET: Combining Lexical and Neural Metrics Towards Robust Machine Translation Evaluation
Although neural-based machine translation evaluation metrics, such as COMET
or BLEURT, have achieved strong correlations with human judgements, they are
sometimes unreliable in detecting certain phenomena that can be considered as
critical errors, such as deviations in entities and numbers. In contrast,
traditional evaluation metrics, such as BLEU or chrF, which measure lexical or
character overlap between translation hypotheses and human references, have
lower correlations with human judgements but are sensitive to such deviations.
In this paper, we investigate several ways of combining the two approaches in
order to increase robustness of state-of-the-art evaluation methods to
translations with critical errors. We show that by using additional information
during training, such as sentence-level features and word-level tags, the
trained metrics improve their capability to penalize translations with specific
troublesome phenomena, which leads to gains in correlation with human judgments
and on recent challenge sets across several language pairs.
Comment: Accepted at EAMT 202
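The paper combines the two metric families during training; a much simpler illustration of the underlying idea is a score-level interpolation, where a lexical metric acts as a guardrail against critical errors a neural metric might miss (the weighting scheme below is an assumption for exposition, not the paper's method):

```python
def combined_score(bleu, comet, lexical_weight=0.3):
    """Interpolate a lexical metric (BLEU, on a 0-100 scale) with a
    neural metric (COMET, roughly on a 0-1 scale) after rescaling BLEU
    to 0-1.  With a non-zero lexical weight, a high COMET score cannot
    fully mask a severe lexical deviation such as a changed number."""
    if not 0.0 <= lexical_weight <= 1.0:
        raise ValueError("lexical_weight must be in [0, 1]")
    return lexical_weight * (bleu / 100.0) + (1.0 - lexical_weight) * comet
```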
Learning Disentangled Representations of Negation and Uncertainty
Negation and uncertainty modeling are long-standing tasks in natural language
processing. Linguistic theory postulates that expressions of negation and
uncertainty are semantically independent from each other and the content they
modify. However, previous works on representation learning do not explicitly
model this independence. We therefore attempt to disentangle the
representations of negation, uncertainty, and content using a Variational
Autoencoder. We find that simply supervising the latent representations results
in good disentanglement, but auxiliary objectives based on adversarial learning
and mutual information minimization can provide additional disentanglement
gains.
Comment: Accepted to ACL 2022. 18 pages, 7 figures. Code and data are available at https://github.com/jvasilakes/disentanglement-va
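Disentangling factors in a VAE typically starts by reserving separate slices of the latent vector for each factor, so that supervision and auxiliary objectives can target each slice independently. A minimal sketch of that partitioning step (the sub-space sizes are arbitrary placeholders, not values from the paper):

```python
def split_latent(z, sizes=(8, 8, 48)):
    """Partition a latent vector into (negation, uncertainty, content)
    sub-spaces so each factor can be supervised or regularised
    separately, e.g. with adversarial or mutual-information penalties."""
    n_neg, n_unc, n_con = sizes
    assert len(z) == n_neg + n_unc + n_con, "latent size must match partition"
    negation = z[:n_neg]
    uncertainty = z[n_neg:n_neg + n_unc]
    content = z[n_neg + n_unc:]
    return negation, uncertainty, content
```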
Non-Exchangeable Conformal Risk Control
Split conformal prediction has recently sparked great interest due to its
ability to provide formally guaranteed uncertainty sets or intervals for
predictions made by black-box neural models, ensuring a predefined probability
of containing the actual ground truth. While the original formulation assumes
data exchangeability, some extensions handle non-exchangeable data, which is
often the case in many real-world scenarios. In parallel, some progress has
been made in conformal methods that provide statistical guarantees for a
broader range of objectives, such as bounding the best F1-score or
minimizing the false negative rate in expectation. In this paper, we leverage
and extend these two lines of work by proposing non-exchangeable conformal risk
control, which allows controlling the expected value of any monotone loss
function when the data is not exchangeable. Our framework is flexible, makes
very few assumptions, and allows weighting the data based on its relevance for
a given test example; a careful choice of weights may result in tighter bounds,
making our framework useful in the presence of change points, time series, or
other forms of distribution drift. Experiments with both synthetic and
real-world data show the usefulness of our method.
Comment: ICLR 202
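The core calibration step in conformal risk control can be sketched in a few lines: scan candidate thresholds and keep the smallest one whose weighted calibration risk stays below the target level. This is a simplified illustration under stated assumptions (monotone losses, a known worst-case loss bound for the test point); it is not the paper's full procedure:

```python
def risk_controlling_lambda(cal_losses, weights, lambdas, alpha, loss_bound=1.0):
    """Pick the smallest threshold lambda whose weighted calibration risk
    (with the unseen test point's loss replaced by its worst-case bound)
    is <= alpha.  `cal_losses(lam)` returns the list of calibration
    losses at threshold lam; losses are assumed monotone non-increasing
    in lam, and `weights` encodes each calibration point's relevance to
    the test example."""
    total_weight = sum(weights) + 1.0  # unit weight on the test point
    for lam in sorted(lambdas):
        losses = cal_losses(lam)
        risk = (sum(w * l for w, l in zip(weights, losses))
                + loss_bound) / total_weight
        if risk <= alpha:
            return lam
    return max(lambdas)  # fall back to the most conservative threshold
```

Down-weighting calibration points that are far (in time or distribution) from the test example is what makes the non-exchangeable variant useful under drift.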
Uncertainty in Natural Language Generation: From Theory to Applications
Recent advances of powerful Language Models have allowed Natural Language
Generation (NLG) to emerge as an important technology that can not only perform
traditional tasks like summarisation or translation, but also serve as a
natural language interface to a variety of applications. As such, it is crucial
that NLG systems are trustworthy and reliable, for example by indicating when
they are likely to be wrong; and supporting multiple views, backgrounds and
writing styles -- reflecting diverse human sub-populations. In this paper, we
argue that a principled treatment of uncertainty can assist in creating systems
and evaluation protocols better aligned with these goals. We first present the
fundamental theory, frameworks and vocabulary required to represent
uncertainty. We then characterise the main sources of uncertainty in NLG from a
linguistic perspective, and propose a two-dimensional taxonomy that is more
informative and faithful than the popular aleatoric/epistemic dichotomy.
Finally, we move from theory to applications and highlight exciting research
directions that exploit uncertainty to power decoding, controllable generation,
self-assessment, selective answering, active learning, and more.
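One of the simplest uncertainty signals an NLG system can expose is the entropy of its next-token distribution; a minimal sketch (this is one generic example of an uncertainty measure, not a method proposed in the paper):

```python
import math

def predictive_entropy(next_token_probs):
    """Shannon entropy (in nats) of a next-token distribution: a basic
    uncertainty signal a generation system could use for self-assessment
    or selective answering, e.g. abstaining when entropy is high."""
    return -sum(p * math.log(p) for p in next_token_probs if p > 0.0)
```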
Findings of the WMT 2021 shared task on quality estimation
© (2021) The Authors. Published by ACL. This is an open access article available under a Creative Commons licence.
The published version can be accessed at the following link on the publisher’s website: http://www.statmt.org/wmt21/pdf/2021.wmt-1.71.pdf
We report the results of the WMT 2021 shared task on Quality Estimation, where the challenge is to predict the quality of the output of neural machine translation systems at the word and sentence levels. This edition focused on two main novel additions: (i) prediction for unseen languages, i.e. zero-shot settings, and (ii) prediction of sentences with catastrophic errors. In addition, new data was released for a number of languages, especially post-edited data. Participating teams from 19 institutions submitted altogether 1263 systems to different task variants and language pairs.
MLQE-PE: A multilingual quality estimation and post-editing dataset
© 2020 The Authors. For reuse permissions, please contact the Authors.
We present MLQE-PE, a new dataset for Machine Translation (MT) Quality Estimation (QE) and Automatic Post-Editing (APE). The dataset contains eleven language pairs, with human labels for up to 10,000 translations per language pair in the following formats: sentence-level direct assessments and post-editing effort, and word-level good/bad labels. It also contains the post-edited sentences, as well as titles of the articles where the sentences were extracted from, and the neural MT models used to translate the text.
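The annotation layers described above can be pictured as one record per translation. A sketch of such a layout, where the field names are assumptions chosen for exposition rather than the dataset's official schema:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class MLQEPERecord:
    """Illustrative layout for one MLQE-PE example; field names are
    placeholders, not the released file format."""
    source: str                 # original sentence
    translation: str            # raw neural MT output
    post_edit: str              # human-corrected translation
    da_score: float             # sentence-level direct assessment
    pe_effort: float            # post-editing effort score
    word_tags: List[str] = field(default_factory=list)  # word-level good/bad
```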
Construction of a biodiversity knowledge repository using a text mining-based framework
Aiming to make the information encapsulated by biodiversity literature more accessible and searchable, we have developed a text mining-based framework for automatically transforming text into a structured knowledge repository. A text mining workflow employing information extraction techniques, i.e., named entity recognition and relation extraction, was implemented in the Argo platform and was subsequently applied to biodiversity literature to extract structured information. The resulting annotations were stored in a repository following the emerging Open Annotation standard, thus promoting interoperability with external applications. Accessible as a SPARQL endpoint, the repository supports knowledge discovery over a large body of biodiversity literature by retrieving annotations matching user-specified queries.
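The retrieval step can be pictured as filtering annotation records by entity type and value. A toy stand-in for such a query, using plain dicts in place of Open Annotation entries and in-memory filtering in place of SPARQL (all names here are illustrative):

```python
def find_annotations(annotations, entity_type, value=None):
    """Return annotation records whose type matches `entity_type` and,
    optionally, whose body matches `value` -- a minimal in-memory
    analogue of querying an annotation repository."""
    return [a for a in annotations
            if a.get("type") == entity_type
            and (value is None or a.get("body") == value)]
```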